Detecting Promotional Content in Wikipedia
Abstract
This paper presents an approach for detecting promotional content in Wikipedia. By incorporating stylometric features, including features based on n-gram and PCFG language models, we demonstrate improved accuracy at identifying promotional articles, compared to using only lexical information and metafeatures.
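As a rough illustration of what an n-gram language-model feature could look like, the sketch below trains two add-one smoothed bigram models, one on promotional text and one on neutral text, and uses the average log-probability of an article under each model as classifier features. This is a minimal sketch under stated assumptions, not the paper's implementation: the toy training sentences, the `BigramLM` class, and the `lm_features` helper are all hypothetical, and the PCFG-based features mentioned in the abstract are not covered.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


class BigramLM:
    """Add-one smoothed bigram language model (hypothetical helper)."""

    def __init__(self, docs):
        self.context_counts = Counter()  # how often each token appears as a bigram context
        self.bigram_counts = Counter()
        self.vocab = set()
        for doc in docs:
            tokens = ["<s>"] + tokenize(doc) + ["</s>"]
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def avg_log_prob(self, text):
        """Average per-bigram log-probability of `text` under the model."""
        tokens = ["<s>"] + tokenize(text) + ["</s>"]
        v = len(self.vocab) + 1  # one extra slot for unseen words
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            num = self.bigram_counts[(prev, cur)] + 1
            den = self.context_counts[prev] + v
            total += math.log(num / den)
        return total / (len(tokens) - 1)


# Toy training data, invented purely for illustration.
promo_docs = [
    "the best award winning company offers amazing leading solutions",
    "our world class team delivers the best innovative products",
]
neutral_docs = [
    "the company was founded in 1990 and is based in berlin",
    "the town is located in the north of the country",
]

promo_lm = BigramLM(promo_docs)
neutral_lm = BigramLM(neutral_docs)


def lm_features(text):
    """Return [promo score, neutral score, promo minus neutral] for one article."""
    p = promo_lm.avg_log_prob(text)
    n = neutral_lm.avg_log_prob(text)
    return [p, n, p - n]
```

In a setup like this, the three scores would be appended to the lexical and meta feature vector before training a classifier; a positive difference suggests the text resembles the promotional training data more than the neutral data.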
Similar resources
Towards Detecting Wikipedia Task Contexts
Wikipedia is a resource used by many people for many different purposes. We posit that it might be beneficial to alter the content or the way content is presented depending on the task context. Here we describe a small pilot lab study to investigate features of interaction that might help to infer the contextual situation surrounding Wikipedia search tasks. We describe our effort to collect dat...
Using Language Models to Detect Wikipedia Vandalism
This paper explores a statistical language modeling approach for detecting Wikipedia vandalism. Wikipedia is a popular and influential collaborative information system. The collaborative nature of authoring, as well as the high visibility of its content, has exposed Wikipedia articles to vandalism, defined as malicious editing intended to compromise the integrity of the content of articles. Ex...
The Workshops of the Tenth International AAAI Conference on Web and Social Media
Event detection in social media usually exploits information from social-networking platforms, such as Twitter or Facebook. However, previous research has suggested that Wikipedia constitutes a valuable source of information for the task of detecting breaking news. In this work we adapt a graph-based algorithm to the Wikipedia context, and compare it to the state-of-the-art Wikipedia real-time ...
Detecting Controversial Articles in Wikipedia
In this paper, we apply graphical models to facilitate quantitative and qualitative investigations into the edit history of articles posted on Wikipedia. Quantitatively, we use the models to measure controversy arising from Wikipedia articles. Qualitatively, we use the models to provide insights into the distribution of editor roles associated with articles. The paper includes exercises that ca...
Identifying, Understanding and Detecting Recurring, Harmful Behavior Patterns in Collaborative Wikipedia Editing – Doctoral Proposal – Fabian
In this doctoral proposal, we describe an approach to identify recurring, collective behavioral mechanisms in the collaborative interactions of Wikipedia editors that have the potential to undermine the ideals of quality, neutrality and completeness of article content. We outline how we plan to parametrize these patterns in order to understand their emergence and evolution and measure their eff...